Explainability of Graph Neural Networks (GNNs) is critical to many GNN applications but remains an open challenge. A convincing explanation should be both necessary and sufficient. However, existing GNN explanation approaches focus on only one of the two aspects, necessity or sufficiency, or on a trade-off between the two. To search for the most necessary and sufficient explanation, the Probability of Necessity and Sufficiency (PNS) can be applied, since it mathematically quantifies both the necessity and the sufficiency of an explanation. Nevertheless, the difficulty of obtaining PNS due to non-monotonicity and the challenge of counterfactual estimation limit its wide use. To address the non-identifiability of PNS, we resort to a lower bound of PNS that can be optimized via counterfactual estimation, and propose Necessary and Sufficient Explanation for GNN (NSEG), which optimizes that lower bound. Specifically, we employ nearest neighbor matching, rather than random perturbation, to generate counterfactual samples for the features. In particular, NSEG combines edges and node features to generate an explanation, of which the common edge-only explanation is a special case. Empirical study shows that NSEG generates the most necessary and sufficient explanations among a series of state-of-the-art methods.
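For orientation, the classical bounds on PNS for a binary treatment and outcome (due to Tian and Pearl) are reproduced below. This is background only: the abstract does not state which lower bound NSEG optimizes, so treating it as this one is an assumption. Here $P(y_x)$ denotes $P(Y = y \mid do(X = x))$, and $x'$, $y'$ denote the complementary events.

```latex
\max\bigl\{0,\; P(y_x) - P(y_{x'})\bigr\}
\;\le\; \mathrm{PNS} \;\le\;
\min\bigl\{P(y_x),\; P(y'_{x'})\bigr\}
```

Because $P(y'_{x'}) = 1 - P(y_{x'})$, the lower bound can equivalently be written as $\max\{0,\; P(y_x) + P(y'_{x'}) - 1\}$, which depends only on interventional quantities and can therefore be estimated from counterfactual samples.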
Deep neural operators can learn nonlinear mappings between infinite-dimensional function spaces via deep neural networks. As promising surrogate solvers of partial differential equations (PDEs) for real-time prediction, deep neural operators such as deep operator networks (DeepONets) provide a new simulation paradigm in science and engineering. Purely data-driven neural operators, and deep learning models in general, are usually limited to interpolation scenarios, where new predictions use inputs within the support of the training set. However, in the inference stage of real-world applications, the input may lie outside the support, i.e., extrapolation is required, which may result in large errors and unavoidable failure of deep learning models. Here, we address this challenge of extrapolation for deep neural operators. First, we systematically investigate the extrapolation behavior of DeepONets by quantifying the extrapolation complexity via the 2-Wasserstein distance between two function spaces, and we propose a new behavior of bias-variance trade-off for extrapolation with respect to model capacity. Subsequently, we develop a complete workflow, including extrapolation determination, and we propose five reliable learning methods that guarantee a safe prediction under extrapolation by requiring additional information: the governing PDEs of the system or sparse new observations. The proposed methods are based on either fine-tuning a pre-trained DeepONet or multifidelity learning. We demonstrate the effectiveness of the proposed framework for various types of parametric PDEs. Our systematic comparisons provide practical guidelines for selecting a proper extrapolation method depending on the available information, desired accuracy, and required inference speed.
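To make the distance used above concrete, here is a minimal sketch of the empirical 2-Wasserstein distance in one dimension, where it reduces to comparing sorted samples. It is only an illustration: the abstract measures the distance between function spaces, whereas this sketch assumes each input function has already been reduced to a single scalar, and the function name `empirical_w2_1d` and the toy data are hypothetical.

```python
import math


def empirical_w2_1d(xs, ys):
    """Empirical 2-Wasserstein distance between two equal-size 1-D samples.

    In one dimension the optimal transport plan pairs the i-th smallest
    value of one sample with the i-th smallest value of the other.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes samples of equal size")
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    mean_sq = sum((a - b) ** 2 for a, b in zip(xs_sorted, ys_sorted)) / len(xs)
    return math.sqrt(mean_sq)


# Toy data (assumption): a scalar summary of training inputs and of
# test inputs that are shifted away from the training support.
train = [0.1 * i for i in range(100)]
test_shifted = [x + 0.5 for x in train]
dist = empirical_w2_1d(train, test_shifted)
```

A pure shift of the test sample by a constant yields a distance equal to that constant, so larger values indicate inputs further from the training distribution.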
Script event prediction aims to predict the subsequent event given the context, which requires the capability to infer the correlations between events. Recent works have attempted to improve event correlation reasoning by using pretrained language models and incorporating external knowledge (e.g., discourse relations). Though promising results have been achieved, some challenges remain. First, the pretrained language models adopted by current works ignore event-level knowledge and therefore capture the correlations between events poorly. Second, modeling correlations between events with discourse relations is limited, because it captures only explicit correlations signaled by discourse markers and misses many implicit correlations. To this end, we propose a novel generative approach for this task, in which a pretrained language model is fine-tuned with an event-centric pretraining objective and predicts the next event within a generative paradigm. Specifically, we first introduce a novel event-level blank infilling strategy as the learning objective to inject event-level knowledge into the pretrained language model, and then design a likelihood-based contrastive loss for fine-tuning the generative model. Instead of using an additional prediction layer, we perform prediction using the sequence likelihoods produced by the generative model. Our approach models correlations between events in a soft way without any external knowledge. The likelihood-based prediction eliminates the need for additional networks to make predictions and is somewhat interpretable, since it scores each word in the event. Experimental results on the multi-choice narrative cloze (MCNC) task demonstrate that our approach achieves better results than other state-of-the-art baselines. Our code will be available at https://github.com/zhufq00/mcnc.
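The likelihood-based prediction described above can be sketched as follows. This is a toy illustration, not the authors' implementation: the per-token log-probabilities are made-up numbers standing in for the output of a generative model, the function name `pick_next_event` is hypothetical, and summing log-probabilities without length normalization is an assumption.

```python
def pick_next_event(candidate_token_logprobs):
    """Choose the candidate event with the highest sequence log-likelihood.

    candidate_token_logprobs maps each candidate event to the list of
    log-probabilities the model assigned to its tokens given the context.
    """
    # Sequence log-likelihood = sum of per-token log-probabilities.
    scores = {
        event: sum(logprobs)
        for event, logprobs in candidate_token_logprobs.items()
    }
    best_event = max(scores, key=scores.get)
    return best_event, scores


# Hypothetical per-token scores for three candidate next events.
toy_candidates = {
    "order food": [-0.2, -0.4],
    "leave restaurant": [-1.5, -0.9],
    "read book": [-2.3, -2.0],
}
best, scores = pick_next_event(toy_candidates)
```

Because every token carries its own score, one can inspect which words made a candidate likely or unlikely, which is the sense in which the prediction is somewhat interpretable.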
The recent success of vision transformers has inspired a series of vision backbones with novel feature transformation paradigms, which report steady performance gains. Although the novel feature transformation designs are often claimed as the source of the gains, some backbones may benefit from advanced engineering techniques, which makes it hard to identify the real gain from the key feature transformation operators. In this paper, we aim to identify the real gain of popular convolution and attention operators and study them in depth. We observe that the main difference among these feature transformation modules, e.g., attention or convolution, lies in the way they aggregate spatial features, the so-called "spatial token mixer" (STM). Hence, we first build a unified architecture to eliminate the unfair impact of different engineering techniques, and then fit STMs into this architecture for comparison. Based on various experiments on upstream/downstream tasks and an analysis of inductive bias, we find that the engineering techniques boost performance significantly, but a performance gap still exists among different STMs. The detailed analysis also reveals some interesting findings about different STMs, for example regarding effective receptive fields and invariance tests. The code and trained models will be publicly available at https://github.com/OpenGVLab/STM-Evaluation
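Semantic segmentation is a key technique for the automatic interpretation of high-resolution remote sensing (HRS) imagery and has drawn much attention in the remote sensing community. Deep convolutional neural networks (DCNNs) have been successfully applied to the HRS imagery semantic segmentation task thanks to their hierarchical representation ability. However, their heavy dependence on large amounts of training data and their sensitivity to variations in data distribution severely restrict the potential application of DCNNs to the semantic segmentation of HRS imagery. This study proposes a novel unsupervised domain adaptation semantic segmentation network (MemoryAdaptNet) for the semantic segmentation of HRS imagery. MemoryAdaptNet constructs an output-space adversarial learning scheme to bridge the domain distribution discrepancy between the source and target domains and to reduce the influence of domain shift. Specifically, we embed an invariant feature memory module to store invariant domain-level context information, because the features obtained from adversarial learning tend to represent only the variant features of the current limited inputs. A category attention-driven invariant domain-level context aggregation module integrates the contents of this memory into the current pseudo-invariant features to further augment the pixel representations. An entropy-based pseudo-label filtering strategy is used to update the memory module with high-confidence pseudo-invariant features of the current target images. Extensive experiments on three cross-domain tasks indicate that the proposed MemoryAdaptNet is remarkably superior to state-of-the-art methods.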
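Physics-informed neural networks (PINNs) have proven to be an effective tool for solving forward and inverse problems of partial differential equations (PDEs). PINNs embed the PDEs into the loss of the neural network, and this PDE loss is evaluated at a set of scattered residual points. The distribution of these points is highly important to the performance of PINNs. However, existing studies of PINNs have mainly used only a few simple residual point sampling methods. Here, we present a comprehensive study of two categories of sampling: non-adaptive uniform sampling and adaptive nonuniform sampling. We consider six uniform sampling methods: (1) equispaced uniform grid, (2) uniformly random sampling, (3) Latin hypercube sampling, (4) Halton sequence, (5) Hammersley sequence, and (6) Sobol sequence. We also consider a resampling strategy for uniform sampling. To improve the sampling efficiency and the accuracy of PINNs, we propose two new residual-based adaptive sampling methods: residual-based adaptive distribution (RAD) and residual-based adaptive refinement with distribution (RAR-D), which dynamically improve the distribution of residual points based on the PDE residuals during training. Hence, we consider a total of 10 different sampling methods: six non-adaptive uniform sampling methods, uniform sampling with resampling, the two proposed adaptive sampling methods, and an existing adaptive sampling method. We extensively test the performance of these sampling methods on four forward problems and two inverse problems in many setups. The numerical results presented in this study are summarized from more than 6000 PINN simulations. We show that the proposed adaptive sampling methods, RAD and RAR-D, significantly improve the accuracy of PINNs with fewer residual points. The results obtained in this study can also serve as a practical guideline for choosing sampling methods.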
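A minimal sketch of residual-based adaptive sampling in the spirit of RAD is given below: residual points are drawn from a dense candidate pool with probability that grows with the PDE residual. The abstract does not give the sampling density, so the weight `|residual|**k / mean(|residual|**k) + c`, the hyperparameters `k` and `c`, the function name `rad_sample`, and sampling with replacement are all assumptions made for illustration; the toy residual stands in for a trained PINN's residual.

```python
import math
import random


def rad_sample(candidates, residuals, n_points, k=1.0, c=1.0, rng=None):
    """Draw n_points candidates with probability increasing in the residual.

    Assumed weight per candidate: |residual|**k / mean(|residual|**k) + c.
    A larger k concentrates points where the residual is large; a larger c
    pushes the distribution back toward uniform.
    """
    rng = rng or random.Random()
    powered = [abs(r) ** k for r in residuals]
    mean_powered = sum(powered) / len(powered)
    weights = [p / mean_powered + c for p in powered]
    # random.choices samples with replacement (a simplification here).
    return rng.choices(candidates, weights=weights, k=n_points)


rng = random.Random(0)
# Dense pool of uniformly random candidate points in [-1, 1].
candidates = [rng.uniform(-1.0, 1.0) for _ in range(10_000)]
# Toy residual (assumption): sharply peaked at x = 0.
residuals = [math.exp(-50.0 * x * x) for x in candidates]
points = rad_sample(candidates, residuals, 500, k=2.0, c=0.1, rng=rng)
fraction_near_peak = sum(1 for x in points if abs(x) < 0.2) / len(points)
```

Under uniform sampling roughly 20% of the points would fall in `|x| < 0.2`; with the residual-weighted draw most of them do, which is the intended effect of concentrating residual points where the PDE is poorly satisfied.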
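Score-based diffusion generative models (SDGMs) have achieved state-of-the-art (SOTA) FID results in unpaired image-to-image translation (I2I). However, we notice that existing methods completely ignore the training data in the source domain, leading to sub-optimal solutions for unpaired I2I. To this end, we propose energy-guided stochastic differential equations (EGSDE), which employ an energy function pretrained on both the source and target domains to guide the inference process of a pretrained SDE for realistic and faithful unpaired I2I. Building upon two feature extractors, we carefully design the energy function so that it encourages the transferred image to preserve domain-independent features and discard domain-specific ones. Further, we provide an alternative interpretation of EGSDE as a product of experts, in which each of the three experts (corresponding to the SDE and the two feature extractors) contributes solely to faithfulness or to realism. Empirically, we compare EGSDE with a large family of baselines on three widely adopted unpaired I2I tasks under four metrics. EGSDE not only consistently outperforms existing SDGM-based methods in almost all settings but also achieves SOTA realism results (e.g., an FID of 65.82 on Cat to Dog and an FID of 59.75 on Wild to Dog on AFHQ) without harming faithfulness.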
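Deep learning-based face recognition models are vulnerable to adversarial attacks. To curb these attacks, most defense methods aim to improve the robustness of recognition models against adversarial perturbations. However, the generalization capacity of these methods is quite limited; in practice, they remain vulnerable to unseen adversarial attacks. Deep learning models are fairly robust to general perturbations such as Gaussian noise. A straightforward approach is therefore to inactivate the adversarial perturbations so that they can be handled as easily as general perturbations. In this paper, a plug-and-play adversarial defense method, named perturbation inactivation (PIN), is proposed to inactivate adversarial perturbations for adversarial defense. We discover that perturbations in different subspaces have different influences on the recognition model. There should be a subspace, called the immune space, in which perturbations have fewer adverse impacts on the recognition model than in other subspaces. Hence, our method estimates the immune space and inactivates the adversarial perturbations by restricting them to this subspace. The proposed method generalizes to unseen adversarial perturbations because it does not rely on a specific kind of adversarial attack method. This approach not only outperforms several state-of-the-art adversarial defense methods but also demonstrates superior generalization capacity in exhaustive experiments. Moreover, the proposed method can be successfully applied to four commercial APIs without additional training, indicating that it can be easily generalized to existing face recognition systems. The source code is available at https://github.com/renmin1991/perturbation in-inactivate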
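Omnidirectional images and videos can provide an immersive experience of real-world scenes in virtual reality (VR) environments. We present a perceptual omnidirectional image quality assessment (IQA) study in this paper, since providing a good quality of experience in VR environments is extremely important. We first establish an omnidirectional IQA (OIQA) database, which includes 16 source images and 320 distorted images degraded by 4 commonly encountered distortion types, namely JPEG compression, JPEG2000 compression, Gaussian blur, and Gaussian noise. A subjective quality evaluation study is then conducted on the OIQA database in the VR environment. Considering that humans can see only a part of the scene in a single movement in the VR environment, visual attention becomes extremely important. We therefore also track head and eye movement data during the quality rating experiments. The original and distorted omnidirectional images, the subjective quality ratings, and the head and eye movement data together constitute the OIQA database. State-of-the-art full-reference (FR) IQA measures are tested on the OIQA database, and some new observations that differ from traditional IQA are made.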
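The rational design of novel molecules with desired bioactivity is a critical but challenging task in drug discovery, especially when treating a novel target family or understudied targets. Here, we propose PGMG, a pharmacophore-guided deep learning approach for bioactive molecule generation. Through the guidance of pharmacophores, PGMG provides a flexible strategy for generating structurally diverse bioactive molecules in various scenarios using a trained variational autoencoder. We show that PGMG can generate molecules that match given pharmacophore models while maintaining a high level of validity, uniqueness, and novelty. In case studies, we demonstrate the application of PGMG to generating bioactive molecules in ligand-based and structure-based de novo drug design, as well as in lead optimization scenarios. Overall, the flexibility and effectiveness of PGMG make it a useful tool for accelerating the drug discovery process.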